Using features of dynamic networks to guide treatment selection and outcome prediction

Simulation Study Visualizations

Authors
Affiliations

Björn S. Siepe

University of Marburg

Matthias Kloft

University of Marburg

Fridtjof Petersen

University of Groningen

Yong Zhang

University of Groningen

Laura F. Bringmann

University of Groningen

Daniel W. Heck

University of Marburg

Published

April 25, 2025

1 Background

This document contains code to reproduce the visualizations of the simulation study in the manuscript. It also contains additional results not shown in the main manuscript.

2 Preparation

Load relevant packages:

Show the code
library(tidyverse)
library(SimDesign)
library(cowplot)
library(showtext)
library(sysfonts)
library(ggokabeito)
library(janitor)
library(ggh4x)
library(pander)
library(MetBrewer)
library(here)
library(lm.beta)

source(here::here("scripts", "00_functions.R"))

Load the simulation results:

Show the code
sim_results <- readRDS(here::here("output", "sim_results.RDS"))

# Create a cut version of the simulation results without MCMC diagnostics
sim_res_cut <- sim_results |>
  select(-contains("rhat")) |>
  select(-contains("divtrans"))

3 Data-Generating Processes

These are the data generating matrices that we used for our fixed effects:

VAR matrix: \[\begin{bmatrix} 0.41 & 0.105 & 0.105 & 0.105 & 0.105 & 0.105 \\ 0.105 & 0.35 & 0.075 & 0 & 0.075 & 0.075 \\ 0.105 & 0 & 0.35 & 0.075 & 0 & 0.075 \\ 0.105 & 0.075 & 0.075 & 0.35 & 0 & 0.075 \\ 0.105 & 0.075 & 0.075 & 0.075 & 0.35 & 0 \\ 0.105 & 0.075 & 0.075 & 0.075 & 0.075 & 0.35 \end{bmatrix} \quad\quad\] Innovation covariance matrix: \[\begin{bmatrix} 1 & 0.188 & 0.188 & 0.188 & 0.188 & 0.188 \\ 0.188 & 1 & 0.15 & 0.15 & 0 & 0.15 \\ 0.188 & 0.15 & 1 & 0.15 & 0.15 & 0.15 \\ 0.188 & 0.15 & 0.15 & 1 & 0.15 & 0.15 \\ 0.188 & 0 & 0.15 & 0.15 & 1 & 0 \\ 0.188 & 0.15 & 0.15 & 0.15 & 0 & 1 \end{bmatrix}\]

4 Deviations from Preregistration

We had multiple minor deviations from our preregistration, which mostly occurred due to errors on our side. We do not think that any of these decreases the informativeness of our study.

Also, after running the simulation study, we realized that our Stan code had an error in one line, where we standardized reg_slope_density_z[9] with reg_slope_density[8]. We corrected this error, and did not use this output anyway.

4.1 Overview Table

Deviation Explanation
Updated BmlVAR formula to fix subscripts and notation The preregistered model formula omitted some necessary subscripts and used inconsistent distributional notation. These were corrected for clarity and correctness in the final version.
Renamed centrality predictor from d to c The label c was chosen for clarity.
Incorrect prior notation for second step regression residual variance We accidentally noted a t-prior on \(log(sigma)\), but a half-t prior on \(\sigma\) was used in both steps for consistency across models and had already been implemented in the code. We fixed the notation to align with our actual implementation
Omitted sum over individuals in centrality selection accuracy formula The preregistered formula did not include a summation over individuals, which is necessary for the proportion.
200 instead of 100 simulation repetitions As we explained in the manuscript, we accidentally calculated that we needed 200 instead of the actual 100 simulation repetitions to achieve the desired worst-case MCSE. We kept the 200 repetitions as they are informative and increase precision.

5 Data Wrangling

Prepare dataframe for visualization by giving proper names, removing unnecessary columns, and pivoting longer:

Show the code
sr_edit <- sim_res_cut |> 
  mutate(
    dgp = factor(n_tp, levels = c(60, 120)),
    n_id = factor(n_id, levels = c(75, 200))
    ) |>
  # remove "reg_" from all column names
  rename_with(~str_remove(., "reg_")) |> 
  # remove everything before a "." in the column names
  rename_with(~str_remove(., ".*\\.")) |> 
  dplyr::select(-c("REPLICATIONS", "SIM_TIME", "SEED", "COMPLETED", "WARNINGS", "RAM_USED")) |>
  dplyr::select(-c("heterogeneity")) |> 
  # pivot longer except conditions cols
  pivot_longer(cols = -c("dgp", "n_tp", "n_id"), 
               names_to = "measure", 
               values_to = "value") |> 
  mutate(measure = str_replace(measure, "power_reg", "powerreg"),
         measure = str_replace(measure, "powertwoside_reg", "powertwosidereg"),
         measure = str_replace(measure, "poweroneside_reg", "poweronesidereg"),
         measure = str_replace(measure, "rmse_reg", "rmsereg"),
         measure = str_replace(measure, "bias_reg", "biasreg"),
         measure = str_replace(measure, "mse_reg", "msereg")) |>
  filter(!measure %in% c("strength_sd", "outstrength_sd", "instrength_sd")) |> 
  separate_wider_delim(measure, 
                       delim = "_",
                       names = c("pm", "outcome", "method", "summary")) |> 
  mutate(method = case_when(
    method == "gvar" ~ "GVAR",
    method == "gimme" ~ "GIMME",
    method == "mlvar" ~ "mlVAR",
    method == "bmlvar" ~ "BmlVAR"
  )) |>
  # treat method as factor and order 
  mutate(method = factor(method, levels = c("GVAR", "GIMME", "mlVAR", "BmlVAR"))) |>
  mutate(outcome = case_when(
    outcome == "beta" ~ "Temporal",
    outcome == "pcor" ~ "Contemporaneous",
    .default = outcome
  )) |>
  group_by(dgp, n_tp, n_id, pm, outcome, method) |>
  pivot_wider(names_from = summary, values_from = value) |> 
  ungroup()

Prepare different colors and settings for visualization:

Show the code
meth_colors <- set_names(MetBrewer::met.brewer("Johnson")[c(1,2,4,5)],
                         c("GVAR", "mlVAR", "GIMME", "BmlVAR"))

6 Check Warnings and Errors

6.1 Errors

Check if any errors that led to non-convergence occurred:

Show the code
# load results before resummarization, which contain potential error messages
sim_full_pre_resum <- readRDS(here("output", "sim_full.rds"))

sim_errors <- SimExtract(sim_full_pre_resum, "errors")

There were 0 error messages.

6.2 Warnings

We extract all warnings and rename them properly:

Show the code
sim_warnings <- SimExtract(sim_full_pre_resum, what = "warnings")

sim_warnings |> 
  select(!c(heterogeneity, strength_sd, outstrength_sd, instrength_sd)) |> 
  rename("Deprecated dplyr function" = "WARNING:  c(\"Warning in NULL : `funs()` was deprecated in dplyr 0.8.0.\", \"Warning in NULL : Please use a list of either functions or lambdas: \\n\\n  # Simple nam",
         "Bulk ESSlow" = "WARNING:  Warning in NULL : Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.\nRunning the chains for more i",
         "Potential Stan sampling problems" = "WARNING:  Warning in NULL : Examine the pairs() plot to diagnose sampling problems\n",
         "Tail ESS low" = "WARNING:  Warning in NULL : Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.\nRunning the chains",
         "NA R-hat" = "WARNING:  Warning in NULL : The largest R-hat is NA, indicating chains have not mixed.\nRunning the chains for more iterations may help. See\nhttps://mc-stan.org/mi",
         "Divergent transition" = "WARNING:  Warning in NULL : There were 1 divergent transitions after warmup. See\nhttps://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup\nto find",
         "Exceeded maximum treedepth" = "WARNING:  Warning in NULL : There were 497 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See\nhttps://mc-stan.org",
         "Rothman algorithm warning" = "WARNING:  Warning in Rothmana(data_l, data_c, lambdas$beta[i], lambdas$kappa[i], regularize_mat_beta = regularize_mat_beta,     regularize_mat_kappa = regulariz",
         "Variable naming warning" = "WARNING:  Warning in validityMethod(object) : The following variables have undefined values:  reg_intercept_z[1],The following variables have undefined values: ") |> 
  knitr::kable()
dgp n_id n_tp Deprecated dplyr function Bulk ESSlow Potential Stan sampling problems Tail ESS low NA R-hat Divergent transition Exceeded maximum treedepth Rothman algorithm warning Variable naming warning
sparse 75 60 60 200 100 200 28 100 3 4759 35
sparse 200 60 0 200 1 199 15 1 0 12834 23
sparse 75 120 0 199 26 197 5 26 0 4555 2
sparse 200 120 0 200 0 163 3 0 0 12042 32

Unfortunately, we removed warnings from mlVAR during our simulation. However, in our additional simulations in 06_additional_mlvar_simulation.qmd, we noticed that warning messages from mlVAR were only related to the renaming of certain variables, so we assume that we did not miss any substantial warnings.

The warnings from the Rothman algorithm in graphicalVAR are substantial in quantity, but they also occured regulary in simulations where the point estimates were relatively accurate. We therefore did not use them to exclude any interations.

7 Point Estimates

Plot point estimate recovery as mean squared error (MSE) which is helpful to understand the overall performance of the different methods.

Show the code
plot_mse <- sr_edit |> 
  mutate(mcse = ifelse(is.na(mcse), 0, mcse)) |>
  mutate(outcome = factor(outcome, levels = c("Temporal", "Contemporaneous"))) |>
  filter(pm == "mse") |> 
  ggplot(aes(x = method, 
             y = mean, 
             colour = method
             )) +
  # add vertical line between methods
  geom_vline(colour = "#F3F4F5", xintercept = seq(1.5, 4, 1))+
  geom_errorbar(aes(ymin = mean - 1*mcse,
                            ymax = mean + 1*mcse),
                        width = .8,
                 position = position_dodge(0.7),
                 show.legend = FALSE)+
  geom_point(position = position_dodge(0.7), 
             size = 1.2) +
  ggh4x::facet_nested(n_id ~ outcome + n_tp,
                      axes = "all",
                      remove_labels = "y") +
  scale_x_discrete(guide = guide_axis(angle = 90))+
  scale_y_continuous(expand = c(0, 0), limits = c(0,0.03)) +
  scale_color_manual(values = meth_colors) +
  theme_centrality()+
  theme(legend.position = "none",
        text = element_text(size = 22))+
  labs(title = "",
       x = "Method",
       colour = "Method",
       y = "MSE of Network Estimation")

plot_mse

Plot Bias:

Show the code
plot_bias <- sr_edit |> 
  mutate(mcse = ifelse(is.na(mcse), 0, mcse)) |>
  mutate(outcome = factor(outcome, levels = c("Temporal", "Contemporaneous"))) |>
  filter(pm == "bias") |> 
  ggplot(aes(x = method, 
             y = mean, 
             colour = method
             )) +
  # add vertical line between methods
  geom_vline(colour = "#F3F4F5", xintercept = seq(1.5, 4, 1))+
  geom_errorbar(aes(ymin = mean - 1*mcse,
                            ymax = mean + 1*mcse),
                        width = .8,
                 position = position_dodge(0.7),
                 show.legend = FALSE)+
  geom_point(position = position_dodge(0.7), 
             size = 1.2) +
  ggh4x::facet_nested(n_id ~ outcome + n_tp,
                      axes = "all",
                      remove_labels = "y") +
  scale_x_discrete(guide = guide_axis(angle = 90))+
  scale_y_continuous(expand = c(0, 0), limits = c(-0.1,0.1)) +
  scale_color_manual(values = meth_colors) +
  theme_centrality() +
  theme(legend.position = "none",
        text = element_text(size = 22))+
  labs(title = "",
       x = "Method",
       colour = "Method",
       y = "Bias of Network Estimation")

plot_bias

8 Centrality

8.1 Plot most central identical

Show the code
plot_mostcentral <- sr_edit |> 
  # if mcse is missing, set to 0
  mutate(mcse = ifelse(is.na(mcse), 0, mcse)) |>
  mutate(outcome = factor(outcome, levels = c("Temporal", "Contemporaneous"))) |>
  filter(pm == "mostcent") |> 
  mutate(n_tp = paste0("t = ", n_tp)) |>
  mutate(n_tp = factor(n_tp, levels = c("t = 60", "t = 120"))) |>
  mutate(n_id = paste0("n = ", n_id)) |>
  mutate(n_id = factor(n_id, levels = c("n = 75", "n = 200"))) |>
  ggplot(aes(x = as.factor(n_tp), 
             y = mean,
             group = method,
             colour = method
             )) +
  # horizontal line at chance level of 1/6
  geom_hline(yintercept = 1/6, 
             linetype = 2, 
             alpha = .7)+
  # add vertical line between methods
  geom_vline(colour = "#F3F4F5", xintercept = seq(1.5, 4, 1))+
  geom_errorbar(aes(ymin = mean - 1*mcse,
                            ymax = mean + 1*mcse),
                        width = .5,
                 position = position_dodge(0.8),
                 show.legend = FALSE)+
  geom_point(position = position_dodge(0.8), 
             size = 1.4) +
  # add mean as text above errorbar
    geom_text(aes(label = sub("^(-?)0.", "\\1.",sprintf("%.2f", mean))),
              position = position_dodge(0.8),
              vjust = -1.0,
              size = 4,
              show.legend = FALSE) +
  ggh4x::facet_nested(n_id ~ outcome,
                      axes = "all",
                      remove_labels = "y") +
  # scale_x_discrete(guide = guide_axis(angle = 90))+
  scale_y_continuous(expand = c(0, 0), limits = c(0,1.1)) +
  scale_color_manual(values = meth_colors) +
  theme_centrality() +
  theme(legend.position = "bottom",
        text = element_text(size = 24),
        
        )+
  labs(title = "",
       x = "Time Points",
       colour = "",
       y = "Proportion of Correct Central Nodes")

ggsave("plot_mostcentral.pdf", plot_mostcentral, height = 6 * 1.1, width = 9 * 1.1,
       path = here::here("figures/"), device = "pdf")

plot_mostcentral

8.2 Plot rank correlation of centrality measures

We were interested in the performance of centrality estimation, which we assessed with regard to rank-order performance. As we expected centrality estimates to be biased downwards in some methods due to sparsity assumptions, we computed the rank-order consistency of individual centrality estimates. To do so, we computed the point estimate of network centrality \(\hat{c}\) per individual in a data set, which ignored estimation uncertainty. We then estimated the Spearman rank correlation \(\hat{\rho}_{i}\) in repetition \(i\) of these estimates with the true network centrality \(c\) and calculated its average across repetitions as: \[\begin{align*} \widehat{\rho} = \frac{\sum_{i=1}^{n_{\text{sim}}} \hat{\rho}_i}{n_{\text{sim}}} \end{align*}\] We calculated the MCSE of this correlation via bootstrapping.

Show the code
plot_rankcor <- sr_edit |> 
  # if mcse is missing, set to 0
  mutate(mcse = ifelse(is.na(mcse), 0, mcse)) |>
  # uselessly used temp and cont instead of beta and pcor here
  mutate(outcome = case_when(
    outcome == "temp" ~ "Temporal",
    outcome == "cont" ~ "Contemporaneous"
  )) |>
  mutate(outcome = factor(outcome, levels = c("Temporal", "Contemporaneous"))) |>
  filter(pm == "rankcor") |> 
  ggplot(aes(x = method, 
             y = mean, 
             colour = method
             )) +
  # add vertical line between methods
  geom_vline(colour = "#F3F4F5", xintercept = seq(1.5, 4, 1))+
  geom_errorbar(aes(ymin = mean - 1*mcse,
                            ymax = mean + 1*mcse),
                        width = .8,
                 position = position_dodge(0.7),
                 show.legend = FALSE)+
  geom_point(position = position_dodge(0.7), 
             size = 1.2) +
  ggh4x::facet_nested(n_id ~ outcome + n_tp,
                      axes = "all",
                      remove_labels = "y") +
  scale_x_discrete(guide = guide_axis(angle = 90))+
  scale_y_continuous(expand = c(0, 0), limits = c(0,1.1)) +
  scale_color_manual(values = meth_colors) +
  theme_centrality() +
  theme(legend.position = "none",
        text = element_text(size = 22))+
  labs(title = "",
       x = "Method",
       colour = "Method",
       y = "Centrality Rank Correlation")


ggsave("plot_rankcor.pdf", plot_rankcor, height = 12, width = 16,
       path = here::here("figures/"), device = "pdf")

plot_rankcor

9 Regression

9.1 Power of Regression

9.1.1 In- and Outstrength combined

Prep data

Show the code
# Split data set into three based on true effect
power_df <- sr_edit |> 
  filter(!str_detect(pm, "poweroneside")) |> 
  mutate(pm = str_replace(pm, "powertwoside", "power")) |> 
  filter(str_detect(pm, "power")) |> 
  mutate(outcome = case_when(
    # outcome == "tempdens" ~ "Temporal\nDensity",
    # outcome == "contdens" ~ "Contemporaneous\nDensity",
    outcome == "outstrength" ~ "Temporal\nOutstrength",
    outcome == "strength" ~ "Contemporaneous\nStrength",
    outcome == "instrength" ~ "Temporal\nInstrength"
  )) |>
  # For now, remove contemporaneous strength
  filter(outcome != "Contemporaneous\nStrength") |>
  mutate(n_id = paste0("n = ", n_id)) |>
  mutate(n_id = factor(n_id, levels = c("n = 75", "n = 200"))) |>
  mutate(n_tp = paste0("t = ", n_tp)) |>
  mutate(n_tp = factor(n_tp, levels = c("t = 60", "t = 120"))) |>
  # split pm into two columns, take last number into new column
  separate_wider_delim(pm, delim = "reg", names = c("pm", "true_effect"))

power_list <- split(power_df, power_df$true_effect)

Function to create the plots:

Show the code
# Function to create the plot for a given dataframe
power_plot <- function(df) {
  ggplot(df, aes(x = n_tp, 
                 y = mean, 
                 colour = method,
                 fill = method,
                 group = method)) +
    geom_errorbar(aes(ymin = mean - 1 * mcse,
                      ymax = mean + 1 * mcse),
                  width = 0.5,
                  position = position_dodge(0.7),
                  show.legend = FALSE) +
    # add vertical
    geom_vline(colour = "#F3F4F5", xintercept = seq(1.5, 4, 1))+
    geom_point(position = position_dodge(0.7), 
               size = 1.4) +
    ggh4x::facet_nested(n_id ~ outcome,
                        axes = "all",
                        remove_labels = "y") +
    # scale_x_discrete(guide = guide_axis(angle = 90)) +
    scale_y_continuous(expand = c(0, 0), limits = c(-0.1, 1.1)) +
    scale_color_manual(values = meth_colors) +
    theme_centrality() +
    theme(legend.position = "none",
          strip.text.x.top = ggplot2::element_text(size = rel(0.85)),
          ggh4x.facet.nestline = element_line(colour = "#6d6d6e"),
          axis.text = element_text(size = 18)) +
    labs(title = "",
         x = "",
         colour = "",
         fill = "",
         group = "",
         y = "Detection Rate")
}

Create the plots and combine them with patchwork

Show the code
plot_power_list <- lapply(power_list, power_plot)

# add legend to the third plot
plot_3_legend <- plot_power_list[[3]] + 
  theme(legend.position = "bottom",
        text = element_text(size = 25))

# extract legend
legend <- cowplot::get_plot_component(plot_3_legend, 'guide-box-bottom', return_all = TRUE)

# add legend to cowplot

plot_power_combined <- cowplot::plot_grid(plot_power_list[[1]],
                                          plot_power_list[[2]],
                                          plot_power_list[[3]],
                                          legend,
                                          ncol = 1,
                                          nrow = 4,
                                          rel_heights = c(1, 1, 1, 0.1),
                                          labels = c("True Effect: 0", 
                                                     "True Effect: 0.2",
                                                     "True Effect: 0.4",
                                                     ""),
                                          label_fontfamily = "news",
                                          label_size = 18)




ggsave("plot_power_combined.pdf", plot_power_combined, height = 16* 0.95, width = 16* 0.8,
       path = here::here("figures/"))
ggsave("plot_power_combined.svg", plot_power_combined, height = 16* 0.95, width = 16* 0.8,
       path = here::here("scripts", "simulation_viz_figures"))

How large was the difference between instrength and outstrength?

Show the code
power_df |> 
  pivot_wider(id_cols = c(dgp, n_id, n_tp, pm, true_effect, method), 
              names_from = outcome, 
              values_from = c(mean, mcse)) |> 
  filter(true_effect != 0) |> 
  janitor::clean_names() |> 
  mutate(mean_diff = mean_temporal_instrength - mean_temporal_outstrength) |> 
  summarize(mean_mean_diff = mean(mean_diff)) |> 
  knitr::kable()
mean_mean_diff
0.0521875

9.2 Power and Bias for Outstrength

This reproduces the simulation figure in the manuscript.

Show the code
power_bias_reg_combined_df <- sr_edit |> 
  mutate(pm = str_replace(pm, "powertwoside", "power")) |> 
  filter(!str_detect(pm, "poweroneside")) |>
  filter(str_detect(pm, "power") | str_detect(pm, "biasreg")) |> 
    mutate(outcome = case_when(
    outcome == "outstrength" ~ "Temporal\nOutstrength",
    outcome == "strength" ~ "Contemporaneous\nStrength",
    outcome == "instrength" ~ "Temporal\nInstrength"
  )) |>
  filter(outcome == "Temporal\nOutstrength") |> 
  mutate(n_id = paste0("n = ", n_id)) |>
  mutate(n_id = factor(n_id, levels = c("n = 75", "n = 200"))) |>
  mutate(n_tp = paste0("t = ", n_tp)) |>
  mutate(n_tp = factor(n_tp, levels = c("t = 60", "t = 120"))) |>
  # split pm into two columns, take last number into new column
  separate_wider_delim(pm, delim = "reg", names = c("pm", "true_effect")) |> 
  mutate(method = factor(method, levels = c("GVAR", "GIMME", "mlVAR", "BmlVAR"))) |> 
  mutate(true_effect = paste0("True Effect: .", true_effect)) |> 
  mutate(pm = case_when(
    str_detect(pm, "power") ~ "Detection Rate",
    str_detect(pm, "bias") ~ "Bias"
  ))

plot_power_bias_outstrength <- power_bias_reg_combined_df |>
  ggplot(aes(x = n_tp, 
             y = mean, 
             colour = method,
             fill = method,
             group = method))+
  geom_errorbar(aes(ymin = mean - 1 * mcse,
                      ymax = mean + 1 * mcse),
                  width = 0.5,
                  position = position_dodge(0.8),
                  show.legend = FALSE) +
    # add vertical
    geom_vline(colour = "#F3F4F5", xintercept = seq(1.5, 4, 1))+
    geom_point(position = position_dodge(0.8), 
               size = 1.4) +
    # add mean without leading zero
    geom_text(aes(label = sub("^(-?)0.", "\\1.",sprintf("%.2f", mean))),
              position = position_dodge(0.8),
              vjust = -0.6,
              size = 5,
              show.legend = FALSE) +
    ggh4x::facet_nested(true_effect + n_id ~ pm,
                        axes = "all",
                        scales = "free_y",
                        independent = "y") +
    facetted_pos_scales(
      y = list(pm == "Bias" ~ scale_y_continuous(expand = c(0, 0), limits = c(-.45, .2)),
               pm == "Detection Rate" ~ scale_y_continuous(expand = c(0, 0), limits = c(-0.1, 1.1)))
    )+
    labs(
      title = "",
      x = "Time Points",
      colour = "",
      fill = "",
      group = "",
      y = ""
    ) +
    scale_color_manual(values = meth_colors) +
    theme_centrality() +
    theme(legend.position = "bottom",
          ggh4x.facet.nestline = element_line(colour = "#6d6d6e"),
          axis.text.x = element_text(size = 22),
          axis.text.y = element_text(size = 22),
          axis.title.x = element_text(size = 24),
          # increase all legend element size
          legend.text = element_text(size = 24),
          legend.title = element_text(size = 24),
          legend.key.size = unit(1.6, "lines"),
          strip.text = ggplot2::element_text(size = 24),
          strip.text.x.top = ggplot2::element_text(size = 24)
          )

ggsave("plot_power_bias_outstrength.pdf", 
       plot_power_bias_outstrength, height = 16* 0.95, width = 16* 0.8,
       path = here::here("figures/"))

9.3 Power and Bias for Contemporaneous Strength

Show the code
power_bias_reg_strength_df <- sr_edit |> 
  mutate(pm = str_replace(pm, "powertwoside", "power")) |> 
  filter(!str_detect(pm, "poweroneside")) |>
  filter(str_detect(pm, "power") | str_detect(pm, "biasreg")) |> 
    mutate(outcome = case_when(
    outcome == "outstrength" ~ "Temporal\nOutstrength",
    outcome == "strength" ~ "Contemporaneous\nStrength",
    outcome == "instrength" ~ "Temporal\nInstrength"
  )) |>
  filter(outcome == "Contemporaneous\nStrength") |> 
  mutate(n_id = paste0("n = ", n_id)) |>
  mutate(n_id = factor(n_id, levels = c("n = 75", "n = 200"))) |>
  mutate(n_tp = paste0("t = ", n_tp)) |>
  mutate(n_tp = factor(n_tp, levels = c("t = 60", "t = 120"))) |>
  # split pm into two columns, take last number into new column
  separate_wider_delim(pm, delim = "reg", names = c("pm", "true_effect")) |> 
  mutate(method = factor(method, levels = c("GVAR", "mlVAR", "GIMME", "BmlVAR"))) |> 
  mutate(true_effect = paste0("True Effect: .", true_effect)) |> 
  mutate(pm = case_when(
    str_detect(pm, "power") ~ "Detection Rate",
    str_detect(pm, "bias") ~ "Bias"
  ))

plot_power_bias_strength <- power_bias_reg_strength_df |>
  ggplot(aes(x = n_tp, 
             y = mean, 
             colour = method,
             fill = method,
             group = method))+
  geom_errorbar(aes(ymin = mean - 1 * mcse,
                      ymax = mean + 1 * mcse),
                  width = 0.5,
                  position = position_dodge(0.8),
                  show.legend = FALSE) +
    # add vertical
    geom_vline(colour = "#F3F4F5", xintercept = seq(1.5, 4, 1))+
    geom_point(position = position_dodge(0.8), 
               size = 1.4) +
    # add mean as text above errorbar
    geom_text(aes(label = round(mean, 2)),
              position = position_dodge(0.8),
              vjust = -0.6,
              size = 5,
              show.legend = FALSE) +
    ggh4x::facet_nested(true_effect + n_id ~ pm,
                        axes = "all",
                        scales = "free_y",
                        independent = "y") +
    facetted_pos_scales(
      y = list(pm == "Bias" ~ scale_y_continuous(expand = c(0, 0), limits = c(-.45, .2)),
               pm == "Detection Rate" ~ scale_y_continuous(expand = c(0, 0), limits = c(-0.1, 1.1)))
    )+
    labs(
      title = "",
      x = "Timepoints",
      colour = "",
      fill = "",
      group = "",
      y = ""
    ) +
    scale_color_manual(values = meth_colors) +
    theme_centrality() +
    theme(legend.position = "bottom",
          ggh4x.facet.nestline = element_line(colour = "#6d6d6e"),
          axis.text = element_text(size = 18),
          # increase all legend element size
          legend.text = element_text(size = 18),
          legend.title = element_text(size = 18),
          legend.key.size = unit(1.5, "lines"),
          strip.text = ggplot2::element_text(size = 24),
          strip.text.x.top = ggplot2::element_text(size = 24)
          )

ggsave("plot_power_bias_strength.pdf", 
       plot_power_bias_strength, height = 16* 0.95, width = 16* 0.8,
       path = here::here("figures/"))
ggsave("plot_power_bias_strength.svg", 
       plot_power_bias_strength, height = 16* 0.95, width = 16* 0.8,
       path = here::here("scripts", "simulation_viz_figures"), device = "svg")

9.4 RMSE and Bias of Regression

We create plot for different regression outcomes here:

Show the code
process_data <- function(data, summary) {
  df_clean <- data |> 
    filter(str_detect(pm, summary)) |> 
    mutate(outcome = case_when(
      outcome == "outstrength" ~ "Temporal\nOutstrength",
      outcome == "strength" ~ "Contemporaneous\nStrength",
      outcome == "instrength" ~ "Temporal\nInstrength"
    )) |>
    filter(outcome != "Contemporaneous\nStrength") |> 
    mutate(
      n_id = factor(paste0("n = ", n_id), levels = c("n = 75", "n = 200")),
      n_tp = factor(paste0("t = ", n_tp), levels = c("t = 60", "t = 120"))
    ) |>
    separate_wider_delim(pm, delim = "reg", names = c("pm", "true_effect"))
  
    split(df_clean, df_clean$true_effect)
}

## Generic plotting function
create_regression_plot <- function(df, 
                                   y_label) {
  ggplot(data = df,
         aes(
           x = n_tp,
           y = mean,
           colour = method,
           fill = method,
           group = method
         )) +
    # add horizontal dashed line at zero for bias
    geom_hline(yintercept = 0, linetype = 2, alpha = .4)+
    geom_errorbar(
      aes(ymin = mean - 1 * mcse, ymax = mean + 1 * mcse),
      width = .5,
      position = position_dodge(0.7),
      show.legend = FALSE
    ) +
    geom_point(position = position_dodge(0.7), size = 1.2) +
    ggh4x::facet_nested(n_id ~ outcome,
                        axes = "all",
                        remove_labels = "y") +
    scale_y_continuous(expand = c(0, 0), limits = c(-.45, .2)) +
    scale_color_manual(values = meth_colors) +
    theme_centrality() +
    theme(
      legend.position = "none",
      strip.text.x.top = ggplot2::element_text(size = rel(0.85)),
      ggh4x.facet.nestline = element_line(colour = "#6d6d6e"),
      axis.text = element_text(size = 18)
    ) +
    labs(
      title = "",
      x = "",
      colour = "",
      group = "",
      fill = "",
      y = y_label
    )
}

## Generate and save combined plots
generate_regression_plots <- function(data, summary, y_label, file_name) {
  processed_list <- process_data(data, summary)
  plot_list <- lapply(processed_list, create_regression_plot, y_label = y_label)
  combined_plot <- cowplot::plot_grid(
    plotlist = plot_list, 
    ncol = 1,
    labels = c("True Effect: 0", "True Effect: 0.2", "True Effect: 0.4")
  )
  ggsave(file_name, combined_plot, height = 16 * 0.95, width = 16 * 0.8, 
         path = here::here("figures/"))
}


## RMSE Plot
generate_regression_plots(sr_edit, "rmsereg", "RMSE", "plot_reg_rmse_combined.pdf")

## Bias Plot
generate_regression_plots(sr_edit, "biasreg", "bias", "plot_reg_bias_combined.pdf")

9.5 Variability of Point Estimates

Here, we investigate the variability of point estimates in specific simulation conditions, for which we need the raw results - this part is therefore not reproducible.

Show the code
r3 <- readRDS("~/centrality-uncertainty/sim_full.rds-results_pc04798/results-row-3.rds")

For easier interpretation, we plot the raw estimates instead of the bias:

Show the code
# extract regression slope for a given method and repetition
extract_coef <- function(res_list, method) {
  map_dfr(res_list, function(x) {
    if (!is.null(x[[method]])) {
      if (method == "bmlvar") {
        tmp <- x[[method]]$reg_bmlvar$regression_slope$median
        tibble(
          `0.0` = tmp[4],
          `0.2` = tmp[5],
          `0.4` = tmp[6],
          method = "BmlVAR"
        )
      } else {
        tibble(
          `0.0` = tryCatch(summary(x[[method]]$reg_outstrength[[1]])$coefficients[2], error = function(e) NA),
          `0.2` = tryCatch(summary(x[[method]]$reg_outstrength[[2]])$coefficients[2], error = function(e) NA),
          `0.4` = tryCatch(summary(x[[method]]$reg_outstrength[[3]])$coefficients[2], error = function(e) NA),
          method = method
        )
      }
    } else {
      NULL
    }
  })
}


methods <- c("gvar", "gimme", "mlvar", "bmlvar")
df_all <- map_dfr(methods, ~ extract_coef(r3$results, .x))


df_all <- df_all |>
  mutate(method = recode(method,
                         "mlvar" = "mlVAR",
                         "gimme" = "GIMME",
                         "gvar" = "GVAR")) |> 
  mutate(method = factor(method, levels = c("GVAR", "GIMME", "mlVAR", "BmlVAR")))

# Plot
est_variability_plot <- df_all |>
  pivot_longer(cols = c("0.0", "0.2", "0.4"), names_to = "reg_coef") |> 
  mutate(
    reg_coef = as.factor(reg_coef)
    # method = factor(method, levels = names(meth_colors))
  ) |> 
  ggplot(aes(x = reg_coef, y = value, fill = method, color = method)) + 
  geom_point(position = position_jitterdodge(jitter.width = 0.35, 
                                              dodge.width = 0.8, 
                                              jitter.height = 0),
             size = 1.5,
             alpha = 0.4) +
  geom_boxplot(position = position_dodge(width = 0.8), 
               outlier.shape = NA, alpha = 0.7) + 
  labs(x = "True Regression Slope",
       y = "Estimated Regression Slope", 
       color = "",
       fill = "") +
  theme_centrality() +
  theme(text = element_text(size = 21)) +
  scale_fill_manual(values = meth_colors) +
  scale_color_manual(values = meth_colors) + 
  scale_y_continuous(breaks = round(seq(-0.6, 1, by = .2), 1))+
  geom_vline(colour = "#F3F4F5", xintercept = seq(1.5, 3, 1))

ggsave("est_variability_plot.pdf", plot = est_variability_plot, 
       height = 6, width = 9,
       path = here::here("figures/"), device = "pdf")

10 MCMC Diagnostics

We can now look at the Bayesian models in more detail and check if the Rhats are satisfactory, as well as if there were any divergent transitions.

First, look at Rhats:

Show the code
sim_results |> 
  select(bmlvar_diagnostics.rhat_bmlvar_mean) |> 
  rename("Mean Rhat" = bmlvar_diagnostics.rhat_bmlvar_mean) |> 
  knitr::kable()
Mean Rhat
1.0002140
0.9996671
1.0001584
0.9998736

Second, we tabulate divergent transitions:

Show the code
sim_results |> 
  select(contains("divtrans")) |> 
  rename_with(~ str_remove(., "bmlvar_diagnostics."), everything()) |> 
  knitr::kable(col.names = c("Mean Number of DivTrans",
                            "Sum of DivTrans",
                            "Models with Divtrans"))
Mean Number of DivTrans Sum of DivTrans Models with Divtrans
4.390 878 100
0.005 1 1
1.310 262 26
0.000 0 0

11 Session Info

Please note that the visualizations were redone in an updated R version, but the computations of the simulation were run in a previous R version as indicated in the Dockerfile.

Show the code
pander::pander(sessionInfo())

R version 4.5.0 (2025-04-11)

Platform: x86_64-pc-linux-gnu

locale: LC_CTYPE=en_US.UTF-8, LC_NUMERIC=C, LC_TIME=en_US.UTF-8, LC_COLLATE=en_US.UTF-8, LC_MONETARY=en_US.UTF-8, LC_MESSAGES=en_US.UTF-8, LC_PAPER=en_US.UTF-8, LC_NAME=C, LC_ADDRESS=C, LC_TELEPHONE=C, LC_MEASUREMENT=en_US.UTF-8 and LC_IDENTIFICATION=C

attached base packages: stats, graphics, grDevices, utils, datasets, methods and base

other attached packages: lm.beta(v.1.7-2), here(v.1.0.1), MetBrewer(v.0.2.0), pander(v.0.6.6), ggh4x(v.0.2.8), janitor(v.2.2.1), ggokabeito(v.0.1.0), showtext(v.0.9-7), showtextdb(v.3.0), sysfonts(v.0.8.9), cowplot(v.1.1.3), SimDesign(v.2.19.2), lubridate(v.1.9.4), forcats(v.1.0.0), stringr(v.1.5.1), dplyr(v.1.1.4), purrr(v.1.0.4), readr(v.2.1.5), tidyr(v.1.3.1), tibble(v.3.2.1), ggplot2(v.3.5.2) and tidyverse(v.2.0.0)

loaded via a namespace (and not attached): gtable(v.0.3.6), xfun(v.0.52), htmlwidgets(v.1.6.4), tzdb(v.0.5.0), vctrs(v.0.6.5), tools(v.4.5.0), generics(v.0.1.3), curl(v.6.2.2), parallel(v.4.5.0), pkgconfig(v.2.0.3), R.oo(v.1.27.0), lifecycle(v.1.0.4), farver(v.2.1.2), compiler(v.4.5.0), textshaping(v.1.0.0), brio(v.1.1.5), munsell(v.0.5.1), codetools(v.0.2-20), snakecase(v.0.11.1), htmltools(v.0.5.8.1), yaml(v.2.3.10), pillar(v.1.10.2), R.utils(v.2.13.0), sessioninfo(v.1.2.3), parallelly(v.1.43.0), tidyselect(v.1.2.1), digest(v.0.6.37), stringi(v.1.8.7), future(v.1.40.0), listenv(v.0.9.1), labeling(v.0.4.3), rprojroot(v.2.0.4), fastmap(v.1.2.0), grid(v.4.5.0), colorspace(v.2.1-1), cli(v.3.6.4), beepr(v.2.0), magrittr(v.2.0.3), future.apply(v.1.11.3), withr(v.3.0.2), scales(v.1.3.0), timechange(v.0.3.0), rmarkdown(v.2.29), globals(v.0.17.0), audio(v.0.1-11), progressr(v.0.15.1), ragg(v.1.4.0), R.methodsS3(v.1.8.2), hms(v.1.1.3), pbapply(v.1.7-2), evaluate(v.1.0.3), knitr(v.1.50), testthat(v.3.2.3), rlang(v.1.1.6), Rcpp(v.1.0.14), xtable(v.1.8-4), glue(v.1.8.0), svglite(v.2.1.3), rstudioapi(v.0.17.1), jsonlite(v.2.0.0), R6(v.2.6.1) and systemfonts(v.1.2.2)